Computer vision uses image-capture-and-processing algorithms {computer vision algorithms} to find features and objects.
image capture
Cameras capture images pixel by pixel, with grayscale or red, green, and blue intensities. Image algorithms can rotate images, change dimensions, and enhance brightness and contrast.
counting pixels
Counts light or dark pixels.
thresholding
Converts grayscale pixels to black or white, depending on whether intensity exceeds a threshold.
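As a concrete illustration, here is a minimal thresholding sketch in Python with NumPy; the threshold value 128 and the tiny test image are arbitrary assumptions, not values from the text.

```python
import numpy as np

def threshold(gray, t=128):
    """Convert a grayscale image to black (0) or white (255) at threshold t."""
    return np.where(np.asarray(gray) > t, 255, 0).astype(np.uint8)

# Example: a tiny 2x3 grayscale image.
image = np.array([[10, 200, 130],
                  [90, 250, 40]], dtype=np.uint8)
print(threshold(image))   # [[  0 255 255]
                          #  [  0 255   0]]
```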
segmenting
Locates or counts textures and/or parts.
detecting edges
Finds object edges.
discovering blobs
Inspects image for connected white or black pixels to establish image reference points and regions.
recognizing components
Extracts simple three-dimensional geometric figures from image.
recognizing patterns
Locates features or objects, which may have different rotations or sizes and overlap or occlude each other.
Image points belonging to same feature are near each other. Parameter-space points belonging to same or similar features are near each other. Nearest neighbors can form point, line, or angle clusters {clustering algorithm}. Nearest-neighbor algorithms measure Euclidean distance, or other distance metrics, to find nearest-neighbor points and make clusters. Using distance measure can group features.
types
Clustering algorithms include hierarchical, self-organizing maps, K-means, fuzzy C-means, and expectation maximization. Error-weighted clustering retrofits clustering algorithms to use error-propagation information.
Self-organizing maps can group into equal categories.
process
Clustering can start with all features in one category and split features off {divisive clustering} or start with individual features and group them into classes {agglomerative clustering}. Clustering can use feature information to start or modify clustering {supervised clustering} or use only distances {unsupervised clustering}.
distance metric
Different hierarchical clustering methods use different ways to calculate distance between two clusters.
Minimum distance between cluster features {nearest-neighbor clustering} {single-linkage clustering} makes loose clusters with long branches containing few samples. Maximum distance between cluster features {furthest-neighbor clustering} {complete-linkage clustering} makes tight similar-size clusters with short branches.
Clustering can use average or median distance between features {unweighted pair-group method average} (UPGMA) or weighted-average point {centroid, cluster}.
For widely varying cluster sizes, average distance between features is weighted by cluster feature number {weighted pair-group average}.
Clustering can use distance between cluster averages {within-groups clustering}.
distance metric: city-block distance
Distance measure between structure-space points is the same as Manhattan distance.
distance metric: Lp-metric
Distance measure between structure-space points is the same as Minkowski distance.
distance metric: Mahalanobis distance
Distance measure between structure-space points weights coordinate differences by the inverse covariance matrix, so correlated variables do not count twice.
distance metric: Manhattan distance
Distance measure between structure-space points is the same as city-block distance.
distance metric: Minkowski distance
Distance measure between structure-space points is the same as Lp-metric.
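The distance metrics above reduce to simple formulas. The sketch below, in Python with NumPy, is a hedged illustration: the function names and test points are illustrative, and the Mahalanobis version assumes a known covariance matrix.

```python
import numpy as np

def manhattan(x, y):
    """City-block (L1) distance."""
    return np.sum(np.abs(np.asarray(x) - np.asarray(y)))

def minkowski(x, y, p=2):
    """Lp-metric; p=1 gives Manhattan, p=2 gives Euclidean distance."""
    return np.sum(np.abs(np.asarray(x) - np.asarray(y)) ** p) ** (1.0 / p)

def mahalanobis(x, y, cov):
    """Distance scaled by the inverse covariance matrix of the data."""
    d = np.asarray(x) - np.asarray(y)
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))

a, b = [1.0, 2.0], [4.0, 6.0]
print(manhattan(a, b))               # 7.0
print(minkowski(a, b, p=2))          # 5.0 (Euclidean)
print(mahalanobis(a, b, np.eye(2)))  # 5.0 with identity covariance
```

With the identity covariance matrix, Mahalanobis distance reduces to Euclidean distance, as the example shows.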
supervised method
Classification can use already known patterns and clusters.
hierarchical clustering
Unsupervised, agglomerative clustering method {hierarchical clustering} measures distances between all points and places features in cluster hierarchy that looks like tree diagram {dendrogram, clustering}. Hierarchical clustering can make hierarchies and assign shared features. Single features, and feature groups, are clusters. After calculating distances for cluster pairs, the closest clusters merge into one cluster, reducing cluster number by one. After calculating distances again for cluster pairs, the closest clusters merge into one cluster, and so on, until all features are in one top-level cluster. Hierarchical clustering methods include centroid linkage, complete linkage, single linkage, and Ward's method. Ward's method calculates cluster mean and sum of squared differences from mean, for all cluster points. The next merge is the pair that gives smallest increase in sum of squared differences.
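A minimal sketch of the merge loop described above, using single linkage (minimum distance between cluster members); complete linkage would substitute the maximum. Function and variable names are illustrative, and the brute-force loop is for clarity rather than efficiency.

```python
import numpy as np

def single_linkage(points, target_clusters=1):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    Cluster-to-cluster distance is the minimum point-to-point distance
    (single linkage). Returns clusters as lists of point indices.
    """
    points = np.asarray(points, dtype=float)
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > target_clusters:
        best = (None, None, np.inf)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(np.linalg.norm(points[a] - points[b])
                        for a in clusters[i] for b in clusters[j])
                if d < best[2]:
                    best = (i, j, d)
        i, j, _ = best
        clusters[i] += clusters[j]   # merge the closest pair
        del clusters[j]              # cluster count drops by one
    return clusters

pts = [(0, 0), (0, 1), (5, 5), (5, 6)]
print(single_linkage(pts, target_clusters=2))  # [[0, 1], [2, 3]]
```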
non-hierarchical clustering
Features can also be in different classes with no hierarchy {non-hierarchical clustering}.
non-hierarchical clustering: k-means clustering
Non-hierarchical, unsupervised method places features into a fixed number of clusters. First, it randomly assigns features to clusters. It calculates distances between feature pairs in cluster to find average cluster expression vector. It calculates distances between cluster pairs using average cluster expression vector. Features move to other clusters in turn, and method calculates all feature distances. Features stay in new cluster if distances between cluster feature pairs and between cluster pairs decrease.
Supervised k-means clustering first assigns features to clusters based on feature information and then proceeds as for unsupervised k-means clustering.
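For comparison, here is a minimal sketch of the common centroid-relocation form of k-means (Lloyd's algorithm); the swap-based procedure described above is a variant of the same idea. The initialization, iteration count, and test points are arbitrary assumptions.

```python
import numpy as np

def kmeans(points, k, iterations=20, seed=0):
    """Assign each point to the nearest of k centroids, then recompute centroids."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iterations):
        # Distance from every point to every centroid, then nearest assignment.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = points[labels == c].mean(axis=0)
    return labels, centroids

pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centers = kmeans(pts, k=2)
print(labels)   # e.g. [0 0 1 1] (cluster numbering may differ)
```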
non-hierarchical clustering: self-organizing map
Non-hierarchical, unsupervised method places features in clusters based on nearness to cluster reference vectors. First, cluster number is set. Starting from random vectors, cluster reference vectors converge by better partitioning feature data. Method calculates distances between each expression vector and the reference vectors and assigns feature to the nearest cluster.
non-hierarchical clustering: principal component analysis
Non-hierarchical, unsupervised methods {principal component analysis, vision} (PCA) {singular value decomposition, vision} can combine feature dimensions linearly to reduce expression-space dimensions, remove redundancy, and average data. Similar unsupervised linear methods {factor analysis, computer} (FA) can look for significant factors in factor sets and find one, two, or three combined factors. It includes principal component analysis, correspondence analysis, spectral mapping, and non-linear mapping. Another similar technique {principal coordinate analysis} combines coordinates to make fewer dimensions. PCA, factor analysis, and principal coordinate analysis project data onto lower-dimension space by eliminating dimensions and dimension combinations with low significance. Number of remaining dimensions gives number of clusters to use for k-means clustering or self-organizing maps.
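A minimal PCA sketch via singular value decomposition: center the data, take the top right singular vectors as loadings, and project to get scores. The data matrix and the number of retained components are arbitrary assumptions.

```python
import numpy as np

def pca(data, n_components=2):
    """Project data onto its top principal components (scores) via SVD.

    Returns the scores (projected data) and the loadings (component directions).
    """
    X = np.asarray(data, dtype=float)
    X = X - X.mean(axis=0)                 # center each column
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    loadings = Vt[:n_components]           # principal directions
    scores = X @ loadings.T                # lower-dimensional representation
    return scores, loadings

X = np.random.default_rng(0).normal(size=(20, 5))
scores, loadings = pca(X, n_components=2)
print(scores.shape, loadings.shape)        # (20, 2) (2, 5)
```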
non-hierarchical clustering: support vector machine
Supervised clustering methods can use feature-vector training sets, group similar-function features into clusters, and group other-function features outside cluster. It tries to find cluster-specific features. Test features are in or outside cluster.
Features split space into two regions by making boundaries {hyperplane}. Hyperplanes can partition features so regions {soft margin} near hyperplanes have ambiguous examples.
Features can partition higher spaces {feature space}, mapped from feature vectors. Feature-space distances are metric and use special function {kernel function}. Finding the best partition typically proceeds from simple to more complex kernel functions.
class analogy
Class analogy is a SIMCA method.
cluster sampling
Sampling can be from clusters equally.
cluster significance analysis
Using discrete or continuous data and embedded data can put compounds into groups by activity level. CSA locates small clusters in large spaces.
Cone and Hodgkin similarity index
Method measures molecular similarity.
discriminant-regression model
Model locates small clusters in large spaces.
distance-b program
Method locates small clusters in large spaces.
Jarvis-Patrick method
Structures can cluster in large databases by rating different features by similarity.
k-nearest neighbor
Supervised method calculates new object distance from all other objects to locate small clusters in large spaces.
partitioning
Process merges individuals into groups or splits whole into clusters.
similarity measure
Value can compare distances.
single-class discrimination
Method locates small clusters in large spaces.
Soft Independent Modeling of Class Analogies (SIMCA)
Supervised method uses model to find region boundaries or envelopes and locates small clusters in large spaces.
trend-vector analysis
Using activity and descriptor correlation vector ranks different features by similarity.
Image regions have sub-regions, which have different numbers of same-intensity pixels. Matrices can represent regions, and matrix cells can represent sub-regions. Cell values are number of same-intensity pixels. Matrices then have differences between cell sub-region values {Haar-like feature}, and matrices represent wavelets {Haar wavelet}. Regions are clusters or neighborhoods, such as rectangles or spheres, and so each region type has unique wavelet. Region sets have waves.
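A hedged sketch of one two-rectangle Haar-like feature computed from an integral image, so any rectangle sum costs four lookups; the rectangle layout and test image are illustrative assumptions.

```python
import numpy as np

def integral_image(img):
    """Cumulative sums so any rectangle sum costs four lookups."""
    return np.cumsum(np.cumsum(np.asarray(img, dtype=float), axis=0), axis=1)

def rect_sum(ii, r0, c0, r1, c1):
    """Sum of pixels in rows r0..r1-1, columns c0..c1-1 using the integral image."""
    total = ii[r1 - 1, c1 - 1]
    if r0 > 0:
        total -= ii[r0 - 1, c1 - 1]
    if c0 > 0:
        total -= ii[r1 - 1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total

def haar_two_rect(img, r0, c0, h, w):
    """Difference between a left and a right sub-rectangle (a Haar-like feature)."""
    ii = integral_image(img)
    left = rect_sum(ii, r0, c0, r0 + h, c0 + w // 2)
    right = rect_sum(ii, r0, c0 + w // 2, r0 + h, c0 + w)
    return left - right

img = np.array([[1, 1, 9, 9],
                [1, 1, 9, 9]])
print(haar_two_rect(img, 0, 0, 2, 4))   # 4 - 36 = -32
```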
Images have points that do not cluster and do not belong to features. Outlier algorithms {outlier algorithms} use linear regression and best-fit to find outliers and discard them.
Methods {Potts model} can minimize energy by making two states either equal or unequal, with constant penalty for unequal and no penalty for equal.
Message-passing labeling inference algorithms {belief propagation} can compute many graphical-distribution statistics, each with few values, to calculate disparities by finding large-neighborhood minima. Belief propagation uses sum-product to find minimum or max-product to find maximum a posteriori (MAP) estimate. For stereo vision, Markov random-field model describes disparity, and inference algorithm determines nodes.
Algorithms {binary space partitioning} (BSP) can recursively divide space or polygon into two regions using hyperplanes, to make halves, quarters, eighths, sixteenths, and so on.
orientation
Hyperplanes can have any orientation, making unequal regions. Hyperplanes that cut at medians make both regions equal.
polygons
Polygons can have angles greater than 180 degrees {reflex angle, polygon} or less than 180 degrees. Dividing polygon recursively makes regions with angles less than 180 degrees {convex set}.
recursion
Recursion steps define trees {BSP tree} and make stored lists {visibility list} that order polygons from front to rear. Convex sets become smaller until they include only point {BSP-tree leaves}.
multiple hyperplanes
Binary space partitioning can use hyperplane pairs or triples for cuts. Hyperplane pair divides space into four regions {quadtree}. Hyperplane triple divides space into eight regions {octree}.
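A minimal quadtree sketch: recursively split a square region into four quadrants until each node holds few points. The point set, bounds, and stopping limits are arbitrary assumptions.

```python
def build_quadtree(points, xmin, ymin, xmax, ymax, max_points=1, depth=0, max_depth=8):
    """Recursively split a square region into four quadrants until each
    node holds at most max_points points (or max_depth is reached)."""
    if len(points) <= max_points or depth >= max_depth:
        return {"bounds": (xmin, ymin, xmax, ymax), "points": points}
    xm, ym = (xmin + xmax) / 2.0, (ymin + ymax) / 2.0
    quadrants = [(xmin, ymin, xm, ym), (xm, ymin, xmax, ym),
                 (xmin, ym, xm, ymax), (xm, ym, xmax, ymax)]
    children = []
    for qx0, qy0, qx1, qy1 in quadrants:
        # Half-open quadrant bounds: [qx0, qx1) x [qy0, qy1).
        inside = [(x, y) for x, y in points if qx0 <= x < qx1 and qy0 <= y < qy1]
        children.append(build_quadtree(inside, qx0, qy0, qx1, qy1,
                                       max_points, depth + 1, max_depth))
    return {"bounds": (xmin, ymin, xmax, ymax), "children": children}

tree = build_quadtree([(1, 1), (2, 7), (6, 3), (7, 8)], 0, 0, 8, 8)
```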
Methods {iterative closest points method} can use point-sample clouds to align two images.
Modified BSP-tree algorithms {kd-tree} {k-dimensional tree} can use only hyperplanes perpendicular to space axes and use axis sequences, typically splitting at axis or polygon medians. If space has k axes, kd-tree has k cuts and tree branchings. kd-tree algorithms are better for searches using nearest neighbors, because they match space coordinate parameters and split hyperplanes go through points. Because hyperplanes are across axes, all regions have nodes or points.
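A minimal kd-tree sketch: build by splitting at medians along alternating axes, then search for a nearest neighbor with branch pruning. Function names and test points are illustrative.

```python
import numpy as np

def build_kdtree(points, depth=0):
    """Split points at the median along alternating axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {"point": points[mid], "axis": axis,
            "left": build_kdtree(points[:mid], depth + 1),
            "right": build_kdtree(points[mid + 1:], depth + 1)}

def nearest(node, query, best=None):
    """Depth-first nearest-neighbor search with branch pruning."""
    if node is None:
        return best
    point, axis = node["point"], node["axis"]
    if best is None or np.linalg.norm(np.subtract(point, query)) < \
            np.linalg.norm(np.subtract(best, query)):
        best = point
    # Search the side of the splitting hyperplane containing the query first.
    near, far = (node["left"], node["right"]) if query[axis] < point[axis] \
        else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Only cross the hyperplane if a closer point could lie on the other side.
    if abs(query[axis] - point[axis]) < np.linalg.norm(np.subtract(best, query)):
        best = nearest(far, query, best)
    return best

tree = build_kdtree([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
print(nearest(tree, (9, 2)))   # (8, 1)
```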
Gauss-Helmert methods can use least squares {least-squares adjustment} to estimate best fit.
Painters paint background first, then layer objects on canvas from rear to front {painter's algorithm}, drawing over whatever lies behind.
Stereo images project onto aligned image plane by transformation {rectification of image}.
Cameras can use epipolar transform and absolute conic image in Kruppa equation to find standard metric {self-calibration}.
Robots can find their locations in environments {self-localization}, using simultaneous localization and mapping (SLAM).
Algorithms {z-buffering} {depth buffering} can store object-part depths for image generation or object recognition. z-buffers {depth buffer} represent two-dimensional images, for object identification. z-culling algorithms compare object-part depths and store object with smallest depth in buffer. The same visual angle covers more space farther away. To control for increasing spread with increasing distance, use variant w-buffers.
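A minimal z-buffering sketch: for each pixel, keep only the fragment with the smallest depth, so nearer object parts hide farther ones. The fragment format and test values are assumptions for illustration.

```python
import numpy as np

def render_with_zbuffer(width, height, fragments):
    """fragments: iterable of (x, y, depth, object_id).

    For each pixel keep only the fragment with the smallest depth,
    so nearer object parts hide farther ones.
    """
    depth_buffer = np.full((height, width), np.inf)
    id_buffer = np.full((height, width), -1, dtype=int)
    for x, y, depth, obj in fragments:
        if depth < depth_buffer[y, x]:      # z-culling: nearer fragment wins
            depth_buffer[y, x] = depth
            id_buffer[y, x] = obj
    return id_buffer, depth_buffer

frags = [(1, 1, 5.0, 0), (1, 1, 2.0, 1), (0, 0, 3.0, 2)]
ids, depths = render_with_zbuffer(3, 2, frags)
print(ids)       # object 1 (depth 2.0) hides object 0 at pixel (1, 1)
```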
Feature detection can use point matching based on feature or boundary matching {feature detection methods}: corner detection, scale-invariant feature transform (SIFT), and speeded-up robust features (SURF). Features are good image-category descriptors.
Algorithms {feature detection algorithms} can detect features.
descriptor
Properties {X-variable, vision} {X descriptor, vision} can describe features and have components.
canonical factor analysis
Factor analysis has a basic method.
centroid method
Factor analysis can use centroids.
Correlation Analysis
Properties and features have relationships.
correspondence factor analysis
Factor-analysis methods can use variable frequencies relative to properties, find chi-square values, and find principal components.
disjoint principal component
Principal components can be independent.
eigenvalue-one criterion
Thresholds can be how many components have eigenvalues greater than one.
eigenvector projection
Unsupervised linear methods can find factors.
Evolutionary Programming
Models can add and subtract randomly selected variables, with crossing-over, and evaluate for "fitness" or best fit. Extinction can happen more or less frequently or in more or fewer species. More-frequent extinctions have fewer species. Values follow power laws, because events can cause few or many extinctions.
evolving factor analysis
Methods can analyze ordered data.
explained variance percentage
Methods can indicate component number required to reach 90% of total variance.
factorial design
Designs can try to ensure design-space sampling, even if one position varies.
Genetic Function Algorithm
Linear property sets can have different values, change values by crossing-over between related genes, and have random changes, to select best fit.
latent variable
Variables can be linear descriptor combinations.
linear discriminant analysis
Supervised methods, in which boundary surface minimizes region variance and maximizes between-region variance, can put compounds into groups by activity level.
linear learning machine
Supervised methods can divide n-dimensional space into regions using discriminant functions.
maximum-likelihood method
Factor-analysis methods can find factors.
multidimensional scaling
Metric or non-metric methods can analyze similarity or dissimilarity matrices to find dimension number and place objects in proper relative positions.
multivariate adaptive regression spline
Non-parametric methods can find factors.
Mutation and Selection Uncover Models
Models can add and subtract randomly selected variables, with no crossing-over, and evaluate for "fitness" or best fit. Low mutation rates allow natural selection to operate on populations to move toward fitter genotypes. Intermediate mutation rates cause population to move toward and away from fitter genotypes. High mutation rates make many genotypes with no consistent direction, so high mutation blocks selection processes.
For any mutation rate, if gene number is too great, change rate is too great, and organism becomes extinct {error catastrophe, extinction}. Therefore, gene number has a limit if organisms do not make new species or find new environments.
Perhaps, cells and ecosystems also have upper limits to complexity. Complexity can increase with migration or speciation.
non-linear iterative partial least squares
Unsupervised linear methods can represent data as product of score matrix, for original observations, and loading-matrix transform, for original factors.
non-linear mapping
Topological mapping factor-analysis method uses linear variable combinations to make two or three new variables.
predictive computational model
Property information can predict behavior.
principal-component analysis
Variable principal components can be linear-descriptor combinations. Unsupervised linear methods can represent data as product of score matrix, for original observations, and loading-matrix transform, for original factors. PCA factor-analysis method uses linear variable combinations to make two or three new variables. PCA reduces unimportant variables.
principal-component regression
Singular-value decomposition can find singular-value sets to predict and project regression to latent structures.
principal factor analysis
Modified PCA can find principal factors.
Procrustes analysis
Methods can identify similarity descriptor sets.
QR algorithm
Methods can diagonalize matrices.
rank annihilation
Unsupervised linear methods can find factors.
response-surface method
Three-level designs can have three factors that quantify response and factor relationships. RSM includes MLR, OLS, PCR, and PLS linear designs, non-linear regression analysis, and non-parametric methods such as ACE, NPLS, and MARS.
Scree-plot
Residual variance approaches constancy, and plotted slope levels off, depending on number of components {Scree-test, vision}.
singular-value decomposition
In unsupervised linear methods, correlation matrix can become product of score, eigenvalues, and loading matrices, with diagonalization using QR algorithm.
spectral-mapping analysis
Factor-analysis methods can first take data logarithms to eliminate outliers and then subtract means from rows and columns, to leave only variation, showing which variables are important and how much.
structure space
Spaces can have two or three principal components.
target-transformation factor analysis
Methods can rotate features to match known patterns, such as hypotheses or signatures.
Unsupervised Method
Factors and response variables can relate, without using factor information or predetermined models.
Methods {eight point algorithm} can recover structure from motion.
People can recognize six basic facial expressions {facial expression recognition}: anger, disgust, fear, happiness, sadness, and surprise. Expressions have unique muscle activities {Facial Action Coding System}, grouped into Action Units. Methods detect faces, extract features, and classify expressions. Classifying can use Gabor filters, Bézier volumes, Bayes and Bayesian network classifiers, and Hidden Markov Models.
Gaussian filtering {Kalman filter} can use mean and variance parameters for normal distributions and can increase feature or pixel gain. Kalman filters are parametric, as opposed to particle filters. Kalman filters predict output from input.
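A minimal one-dimensional Kalman filter sketch with a constant-state model: predict (uncertainty grows), then correct by blending prediction and measurement with the Kalman gain. The variance values and measurements are illustrative assumptions.

```python
def kalman_1d(measurements, process_var=1e-3, measurement_var=0.1):
    """Track a scalar value: predict, then correct with each noisy measurement.

    Returns the filtered estimates.
    """
    estimate, variance = measurements[0], 1.0
    estimates = [estimate]
    for z in measurements[1:]:
        # Predict: the state is assumed constant, so only uncertainty grows.
        variance += process_var
        # Update: blend prediction and measurement by the Kalman gain.
        gain = variance / (variance + measurement_var)
        estimate += gain * (z - estimate)
        variance *= (1.0 - gain)
        estimates.append(estimate)
    return estimates

print(kalman_1d([1.0, 1.2, 0.9, 1.1, 1.05]))
```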
First computer-vision stage is to find features, including invariants {local image analysis}. Invariants can be angles, local phase, and orientation.
Distributions can have representations as finite numbers of samples {particle, sample} defined on Markov chains {particle filtering}, rather than using parameters.
Operators {Sobel edge operator} can detect edges.
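A minimal Sobel sketch in NumPy: correlate the image with the 3x3 horizontal and vertical Sobel kernels and combine the responses into an edge-strength map. The test image with a vertical step edge is an illustrative assumption.

```python
import numpy as np

def sobel_magnitude(gray):
    """Approximate the gradient magnitude with 3x3 Sobel kernels."""
    g = np.asarray(gray, dtype=float)
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
    ky = kx.T
    h, w = g.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            patch = g[i:i + h - 2, j:j + w - 2]
            gx += kx[i, j] * patch
            gy += ky[i, j] * patch
    return np.hypot(gx, gy)   # edge strength, smaller than input by a 1-pixel border

img = np.zeros((5, 5))
img[:, 3:] = 10.0             # vertical step edge
print(sobel_magnitude(img))   # strong response near the step edge
```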
Methods {support-vector machine} can detect shapes from image segmentation, using color, shape, and distances.
Image features make maxima in parameter spaces. Algorithms {hill-climbing methods} can find local maxima that exceed threshold amount. If maximum is at feature parameter-space location and exceeds threshold, algorithm states that feature is in image and identifies location. Hill-climbing methods can become stuck at local maxima and fail to find more-important maxima.
Hill-climbing algorithms {Broyden-Fletcher-Goldfarb-Shanno method} (BFGS method) improve the quasi-Newton method by iteratively updating an approximation to the Hessian matrix.
Hill-climbing algorithms {Newton's optimization method} {Newton optimization method} can solve unconstrained non-linear optimization problems using function second-order partial-derivative Hessian square matrices, to find local maxima at locations where derivatives become zero.
Hill-climbing algorithms {quasi-Newton method} can simplify Newton's optimization method by approximating the Hessian matrix rather than computing it exactly.
Hill-climbing algorithms {watershed algorithm} can find maximum gradient from center pixel to eight surrounding pixels and move to pixel with maximum gradient. If new pixel is minimum or is lower than threshold, stop and assign original pixel to same group as second pixel. If new pixel is not minimal and is not lower than threshold, find maximum gradient from second pixel to eight surrounding pixels and move to pixel with maximum gradient.
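A hedged sketch of greedy hill-climbing on a pixel grid, in the spirit of the gradient-following procedure above: from a start pixel, repeatedly step to the largest of the eight neighbors until no neighbor is larger. The grid values are illustrative, and the routine can stop at a local rather than global maximum.

```python
import numpy as np

def hill_climb(values, start):
    """Greedy hill-climbing on a 2D grid: step to the largest 8-neighbor
    until the current pixel is a local maximum."""
    values = np.asarray(values, dtype=float)
    r, c = start
    while True:
        best_r, best_c = r, c
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if 0 <= nr < values.shape[0] and 0 <= nc < values.shape[1] \
                        and values[nr, nc] > values[best_r, best_c]:
                    best_r, best_c = nr, nc
        if (best_r, best_c) == (r, c):
            return r, c                      # local maximum reached
        r, c = best_r, best_c

grid = np.array([[0, 1, 2],
                 [1, 3, 4],
                 [2, 4, 9]])
print(hill_climb(grid, (0, 0)))   # (2, 2), the peak value 9
```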
Algorithms {image processing algorithms} can determine background, cluster, check error and noise, find fiducials, normalize, check crosstalk, find signals, find features, aggregate replicates, and perform statistical analysis. Statistical cluster analysis can identify classes and assign shared feature. Hierarchical clustering can cluster into hierarchies. Self-organizing maps can group into equal categories.
Determining background {background determination algorithm} involves surface-fitting algorithm that detects global background changes across array. Background subtracts from features. Algorithm finds variance within feature integration aperture after global correction. Using average feature density can overestimate background. Putting space between feature blocks can find background.
Algorithms {error model algorithm} can check feature-signal noise, measure background variance, find cross-hybridization, check for spatial crosstalk and spectral crosstalk, measure normalization variation, and study replicates. Error model weights optimize signal-to-noise ratio. It has confidence value. Error flags label too-high variance, hybridization-control variance, high background, rejected pixels, bright neighbors, too-low signal-to-noise ratio, and saturated pixels.
Fiducials {fiducial-finding algorithm} are marks or shapes every 300 pixels, for automatic spot finding and row and column counting, without using quantitation-control scheme. Fiducials must be easy to distinguish from other image features and artifacts. Spotting and shrinking cause non-linear distortions and determine fiducial frequency needed. Row and column drift must be less than half the distance, five or six pixels, between features.
To account for labeling-amount, dye, fluorescent-detection, spotting, RNA-concentration, and sample-quantity differences, systems modify intensities {normalization, results}. Normalization allows comparison among slides and cell extracts.
types
Normalization can normalize on total intensity. Normalization can normalize on means and use ratio statistics. Normalization can use linear regression. Non-linear regression includes local regression, such as Locally Weighted Scatterplot Smoothing (LOWESS). Normalization algorithms use dilution-series controls, dye selection, filter selection, dye-quenching non-linearities, multiple gain settings, photobleaching amounts, and array-to-array normalization.
Strong signals at features can spread to neighboring features and skew weak signals. Features must be far enough apart to prevent more than 0.1% spatial crosstalk. Algorithms {spatial crosstalk mitigation algorithm} can reduce number of strong signals adjacent to weak signals. Algorithms can weight intensity sum across feature depending on neighboring features. Algorithms can use deconvolution with particular instrument.
Algorithms {signal integration algorithm} {signal quantitation algorithm} can use median pixel, integrate intensity, or use pixel-ratio median to generate values. First, algorithm aligns channels, if necessary. Fixed aperture is in both channels. Simultaneous scanning in both channels reduces crosstalk and allows pixel-to-pixel regression, which is more robust to defects. Integration can be only for high signal-to-noise pixel subsets.
Finding sites {spot-finding algorithm} can use three methods: find isolated peaks, align grid, or align centroid. To find isolated features or peaks, first shrink image to convolve it with feature model and then find isolated peaks. To align grid, first find fiducials and then fit rows and columns. To align centroid, first move off grid and find intensity-weighted centroid, if signal-to-noise feature ratio allows.
Algorithms {probe aggregation algorithm} {replicate aggregation algorithm} can average replicates.
Algorithms {prediction algorithms} can predict.
predictors
Factors, properties, or structures contribute to response values.
variable influence on prediction
Methods {variable influence on the prediction} can determine variable importance.
principal property
Variables {principal property} can be linear descriptor combinations.
non-parametric algorithms
Non-parametric algorithms can have alternating conditional expectations. Non-parametric methods {non-linear partial least-squares, vision} can find least squares.
outlier algorithms
Normal-distribution-outlier tests {Dixon's Q-test, vision} can measure ratio of smallest and largest differences.
Normal-distribution outlier tests {Grubbs' test, vision} can compare the absolute difference between a value and the mean, divided by standard deviation, to a t-distribution critical value.
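A hedged sketch of both outlier tests: the Dixon Q statistic is the gap between the suspect value and its nearest neighbor divided by the range, and the Grubbs statistic is the largest absolute deviation from the mean divided by the standard deviation. The critical-value formula for Grubbs' test uses scipy.stats, which is assumed available, and the data values are illustrative.

```python
import numpy as np
from scipy import stats

def dixon_q(values):
    """Dixon's Q statistic for the most extreme value: gap / range."""
    x = np.sort(np.asarray(values, dtype=float))
    q_low = (x[1] - x[0]) / (x[-1] - x[0])
    q_high = (x[-1] - x[-2]) / (x[-1] - x[0])
    return max(q_low, q_high)

def grubbs(values, alpha=0.05):
    """Grubbs' statistic G = max|x - mean| / s and its critical value."""
    x = np.asarray(values, dtype=float)
    n = len(x)
    g = np.max(np.abs(x - x.mean())) / x.std(ddof=1)
    t = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    g_crit = (n - 1) / np.sqrt(n) * np.sqrt(t**2 / (n - 2 + t**2))
    return g, g_crit

data = [9.8, 10.1, 10.0, 9.9, 10.2, 14.0]
print(dixon_q(data))        # large Q suggests 14.0 is an outlier
g, g_crit = grubbs(data)
print(g > g_crit)           # True: flag 14.0 as an outlier at alpha = 0.05
```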
network
Kohonen topology-preserving network mappings retain topology. Topological indexes, including Tanimoto indexes, can represent graphs as numbers.
rule induction system
IF/THEN statements {rule induction system, vision} can make output from input.
Factors, properties, or structures can correlate with response values {regression algorithms}.
regression
Regression analysis finds property and structure relationships. Multiple linear regression measures linear-component dependence on properties and finds descriptor coefficients. Non-linear regression is a parametric method that finds descriptor coefficients. Ridge regression is another regression method.
correlation
Factors can correlate, with correlation coefficients. Variance-covariance matrices {correlation matrix, vision} are complete, symmetric, square matrices that use property values and structure values, which can scale to normalize data. Partial least-squares can simplify variance-covariance matrix {matrix diagonalization, vision} {matrix bidiagonalization method, regression}.
Spearman rank correlation coefficient can measure molecular similarity.
least-squares
Ordinary least-squares {classical least-squares, vision} {least-squares regression, vision} {linear least-squares regression, vision} {multiple least-squares regression, vision} {multivariate least-squares regression, vision} can find descriptor coefficients by minimizing distances between values and regression line. Inverse least-squares inverts fitting method.
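A minimal ordinary least-squares sketch using NumPy's lstsq, which minimizes the sum of squared distances between values and the regression line; the intercept column, test data, and approximate coefficients are illustrative assumptions.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares: minimize the sum of squared residuals.

    Returns the coefficient vector, including an intercept term.
    """
    X = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
    coeffs, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coeffs

x = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.1, 4.9, 7.0])      # roughly y = 1 + 2x
print(ols(x, y))                         # [~1.0, ~2.0]
```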
adaptive least-squares
Adaptive least-squares modifies ordinary least-squares by weighting or classes.
adaptive least-squares: fuzzy
Features can be in different classes with different weights.
partial least-squares
PLS uses least-squares to find independent variables and dependencies among variables. PLS maximizes latent-variable and observable covariation. PLS diagonalizes variance-covariance matrix. Multi-block PLS uses groups. The kernel algorithm works on covariance matrices to speed computation.
partial least-squares: Comparative Molecular Field Analysis
Partial least-squares methods (CoMFA) can analyze grids around sites and find grid-point interactions, to make sampled-point descriptors.
partial least-squares: Generating Optimal Linear PLS Estimations
GOLPE uses PLS and D-optimal design to select variables and then cross-validates.
partial least-squares: SAMPLS algorithm
Partial least-squares and trend vector analysis can work together.
non-least-squares
Non-least-squares methods can detect non-linear relationships.
Statistics has algorithms {statistical algorithms}.
best linear unbiased estimation
Estimates can give smallest variances among estimators.
mean square error
SSE / (observation number - factor number - 1).
SSE
Errors or residuals cause sum of squares of differences between observed and predicted responses.
SSR
Regression causes sum of squares of differences between predicted and mean.
SST
Sum of squares of differences between observed and mean makes total: SST = SSE + SSR.
standard error
Standard error is MSE square root.
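A minimal sketch tying the quantities above together: SSE from residuals, SSR from predictions versus the mean, SST as their total, MSE from the residual degrees of freedom, and standard error as the square root of MSE. The example reuses the small linear fit from the least-squares sketch above.

```python
import numpy as np

def regression_stats(observed, predicted, n_factors):
    """SSE, SSR, SST, MSE, and standard error for a fitted regression."""
    y = np.asarray(observed, dtype=float)
    y_hat = np.asarray(predicted, dtype=float)
    sse = np.sum((y - y_hat) ** 2)          # residual sum of squares
    ssr = np.sum((y_hat - y.mean()) ** 2)   # explained by the regression
    sst = np.sum((y - y.mean()) ** 2)       # total: SST = SSE + SSR
    mse = sse / (len(y) - n_factors - 1)    # residual degrees of freedom
    return sse, ssr, sst, mse, np.sqrt(mse)

y = [1.0, 3.1, 4.9, 7.0]
y_hat = [1.03, 3.01, 4.99, 6.97]            # from the one-factor linear fit above
print(regression_stats(y, y_hat, n_factors=1))
```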
Transforms {Hilbert transform} can be for one-dimensional signals.
Features have parameters, such as length, angle, color, distance, and radius. Parameters have continuous-value ranges, such as lengths from millimeters to meters, or discrete-value sets, such as color categories. Features have parameter values that make vectors. For example, if parameter number is four, features have four-vectors, such as (0, 0, 0, 0), (3, 0, 0, 0), or (4, 2, 1, 3).
space
Hough spaces have dimension number equal to parameter number. In the example, Hough space has four coordinates. Hough space points can represent features.
feature extraction
Objects whose feature vector lies near feature point have that feature.
feature extraction: accumulator
Feature extraction can use voting {Hough transform}, to accumulate weights at feature points in Hough space {accumulator space} (Paul Hough) [1962]. After voting, if weight passes threshold at feature point, image has feature.
lines
Parameterized standard line, circle, or ellipse test sets establish feature-point coordinates in accumulator space. For lines, accumulator space can use polar-coordinate radius and angle. Edge-detector algorithms can pre-process images to find edges. Hough transforms can group edge points into lines, circles, or ellipses for identification (Richard Duda and Peter Hart) [1972] (Dana H. Ballard) [1981].
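A minimal Hough line transform sketch with a polar (rho, theta) accumulator: each edge point votes for every line that could pass through it, and lines present in the image collect many votes. The one-degree angle step and the synthetic vertical line x = 2 are illustrative assumptions.

```python
import numpy as np

def hough_lines(edge_points, shape, n_theta=180):
    """Vote in (rho, theta) accumulator space for lines through edge points."""
    height, width = shape
    max_rho = int(np.ceil(np.hypot(height, width)))
    thetas = np.deg2rad(np.arange(n_theta))          # 0..179 degrees
    accumulator = np.zeros((2 * max_rho + 1, n_theta), dtype=int)
    for y, x in edge_points:
        for t_idx, theta in enumerate(thetas):
            rho = int(round(x * np.cos(theta) + y * np.sin(theta))) + max_rho
            accumulator[rho, t_idx] += 1             # one vote per point per angle
    return accumulator, thetas, max_rho

# Edge points lying on the vertical line x = 2 in a 10x10 image.
points = [(y, 2) for y in range(10)]
acc, thetas, max_rho = hough_lines(points, (10, 10))
print(acc[2 + max_rho, 0])   # 10: every edge point voted for the line x = 2 (theta = 0)
```

A threshold on the vote count then decides whether the line is reported as present.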
Transforms {pyramid transform} can be for three-dimensional signals, taking high-resolution images and making low-resolution images.
One-dimensional Hilbert transforms {Radon transform}, at specific orientations, can transform multi-dimensional signals.
Transforms {Riesz transform} can be for two-dimensional signals.
Methods {validation algorithms} can check correlations, predictions, and designs.
bootstrapping validation
Method can use only internal data.
external validation
Other data can pair with model to predict activity.
cross-validation
For all data subsets, algorithm {leave-one-out, vision} {leave-groups-out, vision} can remove data subset, fit on the remainder, and predict the removed data, such as for drug validation; a sketch follows the next entry.
cross-validation: correlation coefficient
Method can validate and predict data.
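A minimal leave-one-out cross-validation sketch for the straight-line case: each point is removed in turn, the model is refit on the remainder, and the removed point is predicted; accumulating the squared prediction errors gives the predictive residual sum of squares (PRESS). The data and the use of numpy.polyfit are illustrative assumptions.

```python
import numpy as np

def leave_one_out_press(x, y):
    """Leave-one-out cross-validation for a straight-line fit.

    Each point is removed in turn, the line is refit on the remainder,
    and the removed point is predicted; returns the predictive residual
    sum of squares (PRESS).
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    press = 0.0
    for i in range(len(x)):
        keep = np.arange(len(x)) != i
        slope, intercept = np.polyfit(x[keep], y[keep], deg=1)
        prediction = slope * x[i] + intercept
        press += (y[i] - prediction) ** 2
    return press

x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.1, 4.9, 7.0, 9.1]
print(leave_one_out_press(x, y))   # small PRESS: the linear model predicts well
```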
fitness function
Validation method {lack-of-fit method} {jackknife validation method, vision} can measure fit, such as for drug validation.
Fisher F-test
Validation method can use F test.
other methods
Methods can be for drug validation {predictive residual sum of squares, vision} {scrambling dependent Y-values, vision} {standard deviation method, vision} {standard error of predictions, vision} {standard error of regression, vision}.